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MODELING, SIMULATING, AND PARAMETER FITTING OF 
BIOCHEMICAL KINETIC EXPERIMENTS 

D. GOULETt 


Abstract. In many chemical and biological applications, systems of differential equations con¬ 
taining unknown parameters are used to explain empirical observations and experimental data. The 
DEs are typically nonlinear and difficult to analyze, requiring numerical methods to approximate the 
solutions. Compounding this difficulty are the unknown parameters in the DE system, which must 
be given specific numerical values in order for simulations to be run. 

Estrogen receptor protein dimerization is used as an example to demonstrate model construc¬ 
tion, reduction, simulation, and parameter estimation. Mathematical, computational, and statistical 
methods are applied to empirical data to deduce kinetic parameter estimates and guide decisions re¬ 
garding future experiments and modeling. The process demonstrated serves as a pedagogical example 
of quantitative methods being used to extract parameter values from biochemical data models. 
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1. Introduction. The empirical study of many current biological problems gen¬ 
erates large and complex data sets. How best to use this data to generate and improve 
scientific hypotheses is a subject of great interest to biologists, mathematicians, statis¬ 
ticians, and computational scientists. Combining these various scientific and quanti¬ 
tative disciplines requires careful communication. Figure 11.11 illustrates the flow of 
information for a typical problem in quantitative biology. 

The modeling process described in the present work began with the formation of a 
scientific hypothesis based on laboratory experiments and intuition. This theory was 
used by biochemists to construct an experimental protocol and generate data. The 
same theory was used by mathematicians to develop a mathematical model, which was 
studied analytically and simulated computationally. Both models were intended to 
confirm or improve hypotheses. The theory and experiments are described in Sj2]while 
the mathematical and computational models are described in and respectively. 

For the biological model to feed back on theory, its output data needed to be 
analyzed. For the quantitative model to feed back on theory, its unknown parameter 
values needed to be estimated, so that meaningful simulations could be performed. 
Data modeling allowed empirical evidence to be combined with computational simu¬ 
lation as a means of generating parameter estimates and furthering biological theories. 
This process is described in IJH 

As Figure [13 suggests, the deduction of parameter estimates and confirmation of 
scientific hypotheses is not the end of the modeling process. Indeed, the data model 
answered some questions while raising others, prompting a new cycle of modeling 
which is now underway. Conclusions and future modeling directions are discussed in 

m 

2. Biochemical Modeling. Cells contain a vast array of proteins serving a 
multitude of functions. One class of protein is called a receptor. The specific type of 
receptor of interest is the estrogen receptor ERa, found in various cell types, including 
human breast and ovary cells. This receptor is localized to the nuclear membrane 
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Fig. 1.1. The flow of information for the dimer exchange model. 


and binds to estrogen molecules, ultimately leading to the expression of genes and 
the production of proteins. Because these receptors have been correlated to breast 
cancer, they are of interest to biochemists and drug developers [5]. 

Understanding the molecular structure of ERa, and the mechanism by which it 
acts, can be approached on many experimental and analytical levels. Below a model is 
developed and analyzed to fit data gathered by biochemists attempting to understand 
the rates of formation of estrogen receptors in vitro [7] . 

The laboratory protocol employed is known as the Dimer Exchange Assay. It 
makes use of size exclusion chromatography to separate molecules of different sizes. 
For completeness, a brief description of this type of chromatography is given, followed 
by a more detailed description of the dimer exchange assay itself. 

Size exclusion chromatography is a technique used to separate molecules of dif¬ 
ferent sizes and quantify the concentrations of molecules as they change over time. It 
is an effective tool for understanding the rates at which biochemical reactions occur. 
Reacting molecules are placed on one end of a glass column filled with starchy beads. 
As the molecules diffuse and convect through the column, their interactions with the 
beads, which vary depending on the geometry of the molecule, cause the molecules to 
be separated by size. The data extracted from the chromatography column helps to 
reconstruct the time course of concentrations. A typical time course is illustrated in 
Figure [T3l 

Models of the size-exclusion chromatography mechanism are not pursued here. 
But there has been activity in this area of theoretical and computational modeling |191 
EH [55]. 

The dimer exchange assay is illustrated in Figure (2^ The ligand binding domain 
protein (LBD) of the estrogen receptor exists in monomeric form, but spontaneously 
dimerizes in solution. The fusion monomer is created by the addition of maltose 
binding protein to the LBD. This leads to the formation of two other dimer types, 
the fusion dimer and the heterodimer. See Figure ETTl 
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Fig. 2.1. The five proteins of the dimer exchange assay. Maltose Binding Protein is fused with 
the Ligand Binding Domain protein of ERa. The two monomer types dimerize in three ways. 


It’s not possible to produce solutions of free LBD monomers and observe the 
rate at which dimers form. A dilution of an equilibrated dimer/monomer solution 
would force concentrations away from equilibrium and allow kinetics to be observed. 
However, in the case of LBD estrogen protein, this equilibration occurs much too 
rapidly (on the order of two minutes) to be observed experimentally. To overcome 
this, the dimer exchange assay is performed. Once the solutions of homodimers are 
mixed, equilibrium is approached over roughly a 24 hour period. This allows sufficient 
time for a dimer concentration time series to be recorded. 



Fig. 2.2. The sequence of events in the dimer exchange assay. Solutions of LBD and fusion 
homodimers are prepared and allowed to equilibrate before being mixed. Free monomers are present 
in both mixed and unmixed solutions, but in relatively low concentrations. At regular time intervals, 
a sample is extracted from the mixture and injected into a high pressure liquid chromatography 
column. The areas of resulting chromatograms (black dotted curves) indicate the amount of protein 
in solution around the time of the injection. Peak fitting algorithms are applied to the chromatograms 
to determine the amounts of LBD dimer (left red curves), fusion dimer (right blue curves), and 
heterodimer (center purple curves) present in the effluent. Free monomer concentrations are below 
the noise threshold and can’t be reliably estimated. 


The remainder of this article is presented as follows. In lj3] estrogen receptor ex¬ 
periments are described and a mathematical model of the dimer exchange assay is 
constructed. The model is reduced using conservation laws and experimental obser¬ 
vations. In 12] computational methods for simulating and analyzing the model are 
described. Numerical simulations and optimization algorithms provide estimates of 














4 


D. Goulet 


1.5 


T. □ 

Q B, 


0.5 


0 0 




■□ ■ ■ ■ e.Q ■ 


■ o ■ •. o ■ ■ o. 


■o.o - 


G - 


o Dimer, Du 
□ Fusion Dimer, D 22 
0 Heterodimer, D 12 


10 15 20 25 30 35 

hours 


Fig. 2.3. Data from a dimer exchange assay. See Tahle VA~l\ in AvvendixV^ 


biochemical parameters. In data analytic techniques are used to assess the accu¬ 
racy of these estimates. In ^a general procedure for approaching similar biochemical 
problems is outlined, highlighting the importance of interaction between modelers and 
experimenters. Computational algorithms, experimental data, and related informa¬ 
tion are provided in the appendix. 

3. Mathematical Modeling. 

3.1. Reaction Rate Laws. Estrogen Receptors are dimers not monomers, i.e., 
made of two proteins not one. The two monomeric proteins are identical. Letting D 
represent the dimer and M represent the monomer, the reversible chemical reaction 
forming dimers from monomers is represented as follows. 

(3.1) M + M^D 

When formed in vitro, monomers equilibrate with dimers rapidly, on the order of 
minutes. The chromatography experiments under consideration require on the order 
of hours to perform. Hence, the experimental technique is unable to resolve the 
reaction rates. 

In the dimer exchange assay, some of the monomers are tagged with extra atoms, 
making them chemically distinct from the normal type, but without appreciably al¬ 
tering the way in which dimers form. The altered monomer is said to be of the fusion 
type. 

Assumption 1. The addition of MBP to LBD to create the fusion protein does not 
appreciably alter dimerization kinetic parameters. 

The two types of monomers are now labeled Mi and M 2 . 

(3.2a) Ml -j- M^i ^ Dll 

(3.2b) M 2 -f M 2 ^ D 22 

(3.2c) Ml -|- M 2 ^ Di 2 

Biochemists call Du and D 22 homodimers and D 12 a heterodimer. See Figure 12.11 
Understanding the rates at which these dimers form is essential to an understanding 
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of their biological actions. The dimer exchange assay creates a slowly equilibrating 
system, allowing reaction rates to be more easily studied. See Figure [13] for a typical 
time course. 

The Law of Mass Action [Ill Ell [46] states that the rate at which chemical 
reactions occur is proportional to the products of concentrations of the chemically 
reacting species. Applying this law to reaction (13.11) yields a system of differential 
equations. 

(3.3a) — = -2k+M‘^ + 2k_D 

at 

(3.3b) ^ = k+M^ - k_D 

The dependent variables M{t) and D{f) are concentrations of monomers and dimers, 
respectively. The coefficients k+ and fc_ are rate parameters. They are not rates, 
because their units are l/concentration*time and 1/time, respectively. Rates have 
units of concentration/time. Also note the factor of 2 in equation ()3.3a|l . This is due 
to the 2:1 stoichiometric ratio of monomers to dimers. 

Applying the law of mass action to p.2l) leads to a more complex model. 


(3.4a) 

dMi 

dt 

— —2/pi + 2ki_—Dii 

— k^^-\.MiM2 

+ k^^-Di2 

(3.4b) 

dAd2 

dt 

= —2k2,+M2 2k2,—D22 

— k^^-\.MiM2 

+ ^3,--Di2 

(3.4c) 

dDii 

= — fci _Zlii 



dt 



(3.4d) 

dD22 

= k2. + M2 — k2. — D22 



dt 



(3.4e) 

dDi2 

= k2,^+MiM2 — k3-Di2 



dt 




The constants ki^± are the rate parameters for the six reactions in system 13.21 A 
subscript + (—) denotes a forward (backward) rate constant. The subscript indices 
f = 1,2,3 correspond to reactions 13.2k . b.c. respectively. 

It has been implicitly assumed that concentrations are altered only by a closed 
system of chemical reactions. While it is certain that no source of monomers or 
dimers exists in the experimental apparatus, it is an outstanding question whether 
appreciable losses of protein are occurring by aggregation or binding to the glass of 
the column. Under the conditions of the experiment, proteins should be stable, but 
the possibility of degradation can’t be completely ruled out either. 

Assumption 2. Protein does not exit the closed chemical reaction system described 
by Equations (|3.4I) during the dimer exchange assay. 

This assumption will be revisited in S[51 

3.2. Model Reduction. System (13.41) has five unknown concentrations. Ex¬ 
amining experimental data from Figure 12.31 and Table lA.ll shows that biochemists 
don’t have measurements for all five concentrations. This is due to difficulty of dis¬ 
tinguishing between small monomer concentrations and noise in the chromatogram. 
System ()3.4p also contains six unknown rate constants. Again, the experimental data 
shows there will only be three curves to fit, which may be insufficient to accurately 
approximate six parameters. 











6 


D. Goulet 


3.2.1. Conservation Laws. By examining the simple monomer scenario (EU, 
it is intuitively clear that monomers are never destroyed, they are simply incorporated 
into dimers. Each time a dimer forms, two monomers are used. So the total concen¬ 
tration of monomers, counting both the free and dimerized forms, is 2D -\- M, and 
this quantity should be unchanging in time. Combining equations p.3al) and (I3.3bl) 
confirms this. 

— {2D + M) = 2— + — = 2 (k+M^ - k-D) - 2k+M‘^ + 2k-D = 0 
at at at 

The quantity 2D + M is conserved. As a result, it is equal to its initial value. 

2D{t) + M{t) = 2D{to) + M{to) = a 
Using this to replace M in equation (I3.3bl) yields 

(3.5) ^ = k+{a-2Df - k-D. 

Evidently, conservation laws enable the reduction of a system of ODE [46]. The 
trajectory of (M, D) in two-dimensional concentration space is restricted to a one¬ 
dimensional stoichiometric subspace [18] . 

The same perspective applied to the dimer exchange system ([32]), reveals that 
each of the two types of monomers aren’t destroyed, they only change form. This 
provides two conservation laws. 

(3.6a) ^ (Ml -b 2Dii -b £> 12 ) = 0 

(3.6b) ^ (M 2 + 2 D 22 + D 12 ) = 0 

These can be rephrased as algebraic equations involving initial conditions. 

Mi{t) + 2Dii{t) -b Di2{t) = Ml (to) + 2Zlii(to) + ^ 12 (^ 0 ) = Oii 

M2{t) + 2D22{t) -b Di2{t) = M2(to) + 21122 (^ 0 ) + Dl2{to) = Oil 

When applied to system (EH), these conservation laws reduce the model. 

(3.8a) ~~dt~ ~ ~ 2I?ii — D 12 Y — ki^-Dii 

(3.8b) ~ k2,+ {a2 — 2 .D 22 — D 12 Y — k2-D22 

(3.8c) = fco _|_(ai — 2Dii — Di 2 ){a 2 — 2 D 22 — D 12 ) ~ D 12 

at 

Identifying two conservation laws was done intuitively. If the reaction system were 
more complex, intuition alone could be insufficient to identify these laws. Also, this 
system could allow additional conservation laws. A complete treatment requires find¬ 
ing others or proving that there are no others. 

Though system (13.411 is nonlinear, there is a reformulation which reveals under¬ 
lying linearity. Chemical reactions are modeled by autonomous systems, x' = F{x), 
with F a nonlinear mapping from species to rates of change of species. This mapping 
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can be decomposed, F{x) = L^{x), where $ is a mapping from species to complexes 
and L is a linear mapping from complexes to rates of change of species. 


_d 

dt 


Ml 


—2/ci,+ 

0 

-k3,+ 

2fci,_ 

0 

^3,- 

M 2 


0 

— 2^2,+ 

-k3,+ 

0 ’ 

2k2,- 

ks- 

Dll 

= 

ki,+ 

0 

0 

-ki- 

0 

0 

D 22 


0 

^2.-1- 

0 

0 

-k2.- 

0 

Di2_ 


0 

0 

k3,+ 

0 

0 

-k3.- 


-| 

Mf 


Mi 


M 1 M 2 


Dll 


D 22 

- 

Di2 


Definition 1. Given a vector of species concentrations x, a conservation law for a 
chemical reaction system is any constant linear combination of concentrations which 
is unchanging in time. 


0 = — {viXi + V2X2 + . . . + VnXn) 

dt 



A conservation law is uniquely represented (up to scalar multiples) by the correspond¬ 
ing vector V. This definition motivates a means of identifying conservation laws. 
Theorem 1. Let x' = L$(x) be a chemical reaction system whose right hand side 
has been decomposed as described above. Then v is a conservation law of the system 
only if V € ker . 

Proof. Suppose u is a conservation law for x' = L$(x). 

= v'^x' = (v’^x)' = 0 


This necessary condition is applied to dimer exchange. 


L^v = 


-2fci,+ 

0 

ki,+ 

0 

0 

0 

— 2^2,+ 

0 

^2,-1- 

0 

-k3,+ 

-k3,+ 

0 

0 

k3,+ 

2ki,- 

0 

-ki.- 

0 

0 

0 

2fe.- 

0 

-k2- 

0 

ks- 

k3- 

0 

0 

-k3- 


Vl 


O' 

V2 


0 

V3 

= 

0 

VA 


0 

y^. 


0 


The kernel of is given by 


u = ui [1 0 2 0 


lf + V2[0 1 0 2 



□ 


The conservation laws form a two dimensional space spanned by the intuited con¬ 
servation laws in system (iTbl) . When analyzing large networks, or small networks 
where intuition is lacking, this method of identifying all possible conservation laws is 
effective and simple to implement. 

It’s notable that the conservation laws for the dimer exchange problem are inde¬ 
pendent of the rate parameters. Further decomposition of L allows many interesting 
conclusions to be drawn about the reaction network without regard to the underlying 
differential equations or rate parameters. This is the subject of Chemical Reaction 
Network Theory [aimiiiiiiiiiig]. 

One advantage of deducing conservation laws in this way is that choices for vi and 
V 2 lead to other versions of the conservation laws, ones which may be less intuitive. 
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For example Vi = V 2 = ^ leads to a law for the conservation of the total number 

of monomers. While Vi = 1 = —V2 shows that if the difference in monomer concen¬ 

trations changes, it is accompanied by a two fold change in the difference between 
homodimer concentrations. 

3.2.2. Experimental Observations. As stated in EH the addition of atoms 
to create fusion monomer M 2 is believed to only affect its diffusion through the chro¬ 
matography column, without appreciably influencing its dimerization. This implies 
the following. 

(3.9a) ki- = k 2 ,- = fca,- , 

(3.9b) fci,+ = k2,+ = k-i^+jl . 

The second equality in (I3.9bl) follows from a simple probability argument. 

Suppose there are N molecules of each of two types occupying some volume. Sup¬ 
pose further that a collision randomly occurs between two molecules. If the molecules 
are of different types, then there are possible collisions. If the molecules are of the 
same type, then there are N{N — I)/2 possible collisions. For large values of N, these 
numbers differ by approximately a factor of 2. Because chemical reactions are due to 
random collisions, and because concentration values are simply numbers of molecules 
scaled by volume, the second equality in ()3.9bp follows. 

At the start of the experiment separate solutions of {Mi,Dii) and (M 2 ,£> 22 ) 
are prepared and allowed to reach equilibrium. It’s known that at equilbrium the 
concentrations of monomers are several orders of magnitude smaller than the dimers. 
These solutions are then combined, with a small number of additional monomers being 
freed initially. Because the amount of free monomers can’t be determined at the start 
of the experiment, and because their concentrations are believed to be about 1000 
times lower than those of dimers, the initial concentration of monomers is assumed 
to be zero, i.e., Mi (to) = M 2 (to) = 0. 

(3.10a) 2Dii{to) + Di 2 {to) + Mi (to) = 2Dii(to) + Ili 2 (to) = 2dii + di 2 

(3.10b) ‘2D22{to) + Di2{to) + M2(to) = 2Zl22(to) + £^ 12 (^ 0 ) = 2(^22 + di2 

Assumption 3. Initial concentrations of monomers are negligible. 

The remaining initial concentrations are known. After these reductions, system 
(13.81) now has three unknown concentrations, {Du, D 22 , D 12 }, and two unknown pa¬ 
rameters, {fc+,fc_}. For simplicity of notation, the dimer concentrations are hence- 


forth denoted by {x,y,z} respectively. 


(3.11a) 

— = k^(2dii + di 2 — 2x — 
dt 

x(to) = dll 

(3.11b) 

^ = k+{ 2 d 22 + di 2 -2y- zf - k-y, 
dt 

y{to) = C?22 

(3.11c) 

dz 

— = 2k^{2dii + di 2 — 2x — z){ 2 d 22 + di 2 — 2y — z) — k-Z, 
dt 

z{to) = di 2 


It’s important to note that standard nonlinear regression techniques used by bio¬ 
chemists are able fit exponential models to the type of data displayed in Figure 12.31 
Due to the disparity between the time scales for association and dissociation, this 
regression gives estimates of the dissociation parameter k- only. These exponential 
fits agree somewhat with available experimental data on dimer concentrations. How¬ 
ever, without experimental measurements of monomer concentrations, information 
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on association was thought to be lost, making estimates of /c_|_ unattainable. The 
methods described in subsequent sections give the first estimates of k~^ from available 
chromatograms. 

4. Computational Modeling. 

4.1. Numerical ODE Methods. Analytical solutions to nonlinear ODE are 
rarely possible to obtain. To compute numerical approximations of these solutions, the 
software package MATLAB is used [33] . The types of ODE being solved with the range 
of parameters used and the numerical error tolerances required lead to the numerical 
solutions exhibiting stiffness, that is, the dynamic step size adjustments of the explicit 
Runge-Kutta solver ode45 are made unnecessarily small to achieve stability. So called 
stiff solvers based on Rosenbrock methods or backwards differentiation formulae are 
more efficient choices, e.g., MATLAB’s ode23s and odel5s. The larger stability regions 
of these methods allow accuracy to be achieved with larger step sizes. [CTlTfll48] . 

4.2. Parameter Fitting. Performing numerical simulations requires some 
knowledge of the rate parameters and initial conditions. As Murray [33 p. 417] stated, 
“... parameter estimates ... are essential in any practical application of a model to a 
specific biological problem.” 

To estimate rate parameters, a method of computing the error between the em¬ 
pirical data and the numerical solution is needed. One common and simple choice is 
the sum of square errors and the corresponding root mean square erroqj. See [1] for 
a detailed discussion of the statistical basis for least squares minimization applied to 
parameter estimation problems. 

Let Xi be the experimental value approximated by numerical solution x{t) at time 
ti- Define {yi,Zi} similarly. Define the sum of square errors and the root mean square 
error. 


SSE = ^ {{x{ti) - XiY + {y{u) - Uif + {^{U) - Zi)'^) , RMSE = v^SSE/3n. 

To each pair of independent variables (k+,k-) corresponds a value of the dependent 
variable, SSE. In this way an error surface is generated. An algorithm is needed which 
minimizes the error, i.e., estimates the location of the global minimum on the error 
surface. Such algorithms successively choose parameter values to diminish the SSE 
until it reaches desired tolerances. 

Examining Figure l4Tl reveals that, for a typical dimer exchange assay, the local 
minimum on the error surface is located in a basin which is steep in the vertical 
direction but shallow in the horizontal direction. Indeed, in the figure fc+ varies over 
[100, 700] while fc_ varies over [0.29, 0.34], an aspect ratio of 12,000. When a minimum 
is trapped in such a narrow region, optimization algorithms which rely on gradient 
estimates tend to perform poorly or fail altogether. A famous example demonstrating 
this difficulty is the Rosenbrock Banana Function [40]. 

One algorithm which is effective for this type of optimization is the Nelder-Mead 
Simplex Method [SHIM]- It’s implemented by MATLAB’s fminsearch. Given a start¬ 
ing guess for the parameters, the simplex algorithm deterministically selects a succes¬ 
sion of parameter pairs which tend towards a local minimum on the error surface. 


*The RMSE is related to the I 2 norm which is sometimes replaced with other Ip norms or the 
cosine measure. A variety of other special purpose norms are available El Cl]. 
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minimum RMSE= 0.0028 



Fig. 4.1. A contour plot of the error surface for a typical dimer exchange experiment. The 
minimum occurs at (/c+, fc—) ~ (418, 0.314). 




Fig. 4.2. The best fit numerical solutions for two typical dimer exchange assays. Top: Data 
from Table \A.2\ leads to best fit parameters (/c_j_,fc_) = (284,0.238). Bottom: Data from Table lAT^ 
leads to best fit parameters (/c+,/c_) = (439,0.301). 
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When multiple local minima are suspected, multi-start methods |29l I50j or simu¬ 
lated annealing El US] are often used to search for the global minimum. Such methods 
were not applied to dimer exchange, but their frequent use in applications motivates 
the following brief descriptions. 

Suppose it is known that a surface has multiple local minima contained in some 
finite domain. Starting from a single initial parameter estimate and proceeding with 
a minimization algorithm leads to a numerical estimate of a single local minimum. 
If the numerical experiment is repeated with different start values, it’s possible that 
the same local minima, or a different one, will be found. Repeating this process with 
many initial parameter guesses spread over parameter space is the idea of a multi-start 
method. If a visual inspection indicates the location of minima, then appropriate 
initial estimates are easy to obtain. However if the parameters are spread over a large 
multidimensional domain, such a priori knowledge of the locations of minima is not 
easily obtained. Finding all local minima may require a shotgun approach to be used, 
whereby numerous start values are selected with sufficient density to suggest that all 
possible local minima will be found. A multi-start method for a high dimensional 
problem can be computationally expensive. 

Multi-start is an dense deterministic search of all relevant parameter space, while 
simulated annealing is a random walk over the subregions of parameter space pre¬ 
sumed likely to contain global minima. Annealing is a metallurgical technique for 
creating stable alloys by carefully heating and cooling a mixture of metals accord¬ 
ing to a prescribed temperature schedule. Alloys cooled rapidly have their molecules 
locked into a local minimum energy state that is often far from globally optimal. By 
the application of a designed cooling schedule, the molecules gradually come to their 
equilibrium locations. Heat promotes large molecular deviations to new unexplored 
energy configurations. An effective combination of heating and cooling allows suffi¬ 
cient opportunity for large molecular deviations (to find regions with possible global 
minima) and also for exploration of the depths of the energy minima. 

Inspired by physics, simulated annealing is a minimization process where the next 
set of coordinates to be used in the minimization process are chosen according to the 
outcomes of previous simulations but with the addition of noise to mimic thermal 
effects. The amount of noise added at each round of coordinate selection follows a 
temperature schedule chosen to allow a balance between finding areas with minima 
(high temperature random large deviations) and exploring the depths of the minima 
(low temperature nearly deterministic small deviations). Multi-start and simulated 
annealing are frequently combined. 

MATLAB’s Global Optimization Toolbox enables applications of multi-start, sim¬ 
ulated annealing, genetic algorithms, and other global minimization methods [80] . 

An examination of a large portion of the error surface suggests that dimer ex¬ 
change admits a single local minimum which may be approximated without global 
methods. 

Remark I. The parameter fitting method used for the dimer exchange assay was cho¬ 
sen for simplicity. In practice, Kalman Filtering and other methods [22l [25] are used 
to not only estimate parameters but to suggest appropriate mathematical models for 
systems which may be not be completely understood and to indicate which laboratory 
experiments should be performed next. 

4.3. Sensitivity Analysis. 

4.3.1. Local Sensitivity. It’s important to understand how sensitive a system 
is to alterations of the parameters. There are many potential sources of error, possibly 
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leading to variation in parameter estimates. 

1. Experimental data may not be adequate, e.g.^ the nature of chromatography 
causes the hrst time point of the dimer exchange assay to be recorded at t = 1/30 
hours instead of t = 0. This causes inaccuracy in the initial conditions. 

2. Experimental data is never perfectly reproducible so that separate experi¬ 
ments lead to different parameter estimates. 

3. Numerical solvers give approximate solutions, up to specified error tolerances. 

The second source of error is considered in §5] If a model is highly sensitive to changes 
in parameters, small experimental errors may lead to large deviations in the best fit 
parameters, even though large differences in experimental outcomes weren’t observed. 
If a model is insensitive to changes in the parameters, then the experimental outcome 
may not be appreciably altered by those parameters. Both scenarios may imply that 
certain components of the model should be replaced, modified, or removed. 

Suppose an ODE is solved multiple times for x(t) with different, but similar, 
values of a model parameter, k. Small changes in x(t) will result due to small changes 
in k. 

Change in x Sx dx 
Change in k 5k dk 

Consider the following initial value problem. 

UT 

(4.1) —=f{x,t]k), x{0)=XQ(k). 

dt 

The function / and the initial data xq depend on a parameter k, e.g., an autocatalysis 
model with initial data at the concentration of half-maximal production rate, x' = 
x^ jix^ -I- fc^) with x(0) = k. 

Definition 2. Let x{t;k) be the solution to initial value problem (14.11) . The local 
sensitivity of x with respect to k is defined as j/(t; k) = ^. 

Theorem 2. The local sensitivity satisfies the following ODE. 

dt dx ^ dk 

Proof. By the chain rule and Clairaut’s theorem, 

dy d dx d dx df df df 

dt dt dk dk dt dk dx^ dk 

□ 

As an example, sensitivity analysis is applied to system (13.51) with y = j^. 

^=k+ (Mo + 2Do - 2Df - k_D, D{0) = Do, 

at 

$ = (-4fc+ (Mo + 2Do - 2D) -k-)y + 4fc+ (Mq + 2Do - 2D) , y(0) = 1. 

dt 

Local sensitivity analysis may be applied to systems of ODE with multiple parameters. 
Specialized numerical methods have been designed for both local and global sensitivity 
analysis uniiia- In general, n equations with m parameters leads to a system of n(m + 
1) equations for the solutions and their parameter sensitivities. See Equations (IB.II) in 
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Appendix |B] for the nine equations used in the sensitivity analysis of solution {a;, y, z} 
with respect to parameters 

Of interest is how a relative change in the solution would arise in response to a 
relative change in a parameter. That is, k will be perturbed, the change in x will be 
found, and ratios of relative changes will be computed. 

Definition 3. The relative local sensitivity of solution x with respect to parameter 
k is ky/x. 

The motivation of this definition is intuitive. 

Relative change in a; ^ k 6x k 

Relative change in k ^ x 5k x^ 

An alternative expression for the relative sensitivity is, by the chain rule . See [15] 
for an application. 

Figure 14.31 shows relative local sensitivities of solutions {x, y, z} to changes in 
parameters {k+,k-}. This figure reveals that relative sensitivity of z to k+ peaks 
during the first hour. Unfortunately, during this initial phase only one data point 
was recorded by the experimentalists. Limitations of the experimental protocol make 
recording more data in this interval challenging. Nonetheless, this new observation 
suggests that more accurate estimates of may be found by restricting simulation 
and data fitting to the early part of the experiment. 

The figure was produced by numerically solving the ODE for the local sensitiv¬ 
ities using standard MATLAB ODE solvers. Users of MATLAB’s SimBiology tool 
may compute sensitivities automatically without the need for explicitly stating the 
governing ODE [55] . 

4.3.2. Global Sensitivity. Local sensitivities are computed by means of deriva¬ 
tives of outcome variables with respect to single parameters, e.y., dx/dk+. As such, 
these sensitivities are most informative when changes in parameters and outcomes are 
sufficiently small so as to be well approximated by infinitesimals and when parame¬ 
ters are sufficiently independent so that changes in outcomes due to each parameter 
may be examined separately. Many biological experiments show large deviations in, 
and nonlinear interactions between, parameters. A global approach to the study of 
sensitivity is often appropriate. 

While local sensitivity can be analyzed by solving additional ODE, global sen¬ 
sitivity has been defined in many ways and sophisticated statistical tools are often 
required to analyze it [44l|52]- Such analyses typically proceed in three stages. Firstly, 
knowledge and intuition of the possible ranges for the values of the n parameters are 
used to predetermine the parameter set of interest. Next, a finite subset of n-tuples 
are sampled from this set and the computational simulation is run n times using those 
parameters. Finally, the n simulation outcomes are statistically analyzed to determine 
how alterations to each parameter affected the outcomes. 

The Fourier Amplitude Sensitivity Test (FAST) is a method for studying global 
sensitivity uni HI]. In its simplest form, FAST proceeds in three stages. First, the 
rate constants are varied parametrically at different frequencies. 

/c+(s) = sin(a;+s), k-{s) = sin(w_s). 

Here k^ are representative values of A:±, s is a parameter defined on [—7r,7r], and uj± 
are integers. Then, for a specified time and for all s, an outcome variable, e.g., x(s), 
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Fig. 4.3. The data from Table fXl leads to best fit parameters [fc+,fc_] = [284,0.238]. ^ee 
Figurethese values, relative sensitivities of {x^y^ z} were computed with respect to 
(top) and k— (bottom). Top: At hour 20, the relative sensitivities to fc+ in decreasing order are 
for dimer (red), heterodimer (purple), and fusion dimer (blue). Bottom: at hour 20, the relative 
sensitivities to k— in decreasing order are for fusion dimer (blue), heterodimer (purple), and dimer 
(red). 


is computed and its Fourier coefficients are found. 

1 

= -Ao + ^ A, cos(js) + Bj sin(js), 

I r i r 

(4.2) = — / 2 ;(s) cos(js) ds, B, = — / x{s) sin(js) ds. 

27^ J-T, Stt 

Parseval’s theorem shows that the sum of squares of these coefficients is proportional 
to the variance of x{s). 

Var(a:) = — x{sf ds - x(s) dsj = - ^ + B]) 























Modeling, Simulating, and Parameter Fitting in Biochemistry 


15 


Finally, by summing only those squares of coefficients corresponding to frequencies 
which are multiples of the w’s, the variance due to the changes in a particular rate 
constant can be discerned. This motivates the computation of a sensitivity index, 
which is the ratio of the variance due to one parameter to the total variance. For 
example, 

By definition, sensitivity indices are elements of [0,1]. The selection of G±, uj±, and 
a finite set of s values to accurately approximate the integrals are subtle issues 
addressed elsewhere [iniiiHiiis]. 

For any time point of interest, the sensitivity indices determine which portion 
of the variance can be attributed to each parameter. By repeating the analysis for 
different times, global sensitivity indices may be plotted in a manor similar to the 
relative local sensitivities of Figure 1131 

The interested reader is referred to [28] and [38], which provide background on 
global sensitivity and whose authors supply open source software. Dimer exchange 
was studied using MATLAB routines from the former. These routines implement the 
authors’ extended version of the algorithm, eFAST [^. Users of MATLAB’s Simulink 
have access to a suit of tools appropriate for the study of global sensitivity EH. 

Applying eFAST to dimer exchange results in Figure 14.41 As with relative local 
sensitivity, global sensitivity to alterations in peaks during the hrst hour of the 
experiment, highlighting the possible utility of a protocol which would allow addi¬ 
tional data to be collected during the initial phase. These results have motivated new 
experimental design which, if successfnl, may enable data collection over shorter time 
intervals. 

Comparing the vertical scale of the two plots in Figure |U4l suggests much greater 
sensitivity of model outcomes to large deviations in k-. The relative insensitivity of 
outcomes to large deviations in fc_|_ poses difficulties when estimating best fit parame¬ 
ters. Variance in parameter estimates is addressed in !]5l 

5. Data Modeling. The effect of variations in the experimental data on the 
computed optimal rate constants can be studied using data analytic methods. The 
sensitivity of computational outcomes to parameter values was highlighted in il4.3l 
Fitting data to a single experiment gives rate constants which describe that single 
experiment. By examining a set of n experiments, a set of n pairs of rate constant 
estimates can be found. If the experimental outcomes are consistent, and if the model 
and numerical simulations are accurate, then the computed rate constants should be 
consistent across all experiments. It’s necessary to give analytical meaning to this 
vague expectation of consistency. 

The data methods employed below fall into two categories, 1) graphical methods 
which give qualitative information about the rate constants and 2) analytical methods 
which give quantitative information. 

The left of Figure lOI shows a scatter plot of the optimal rate constants for the 
raw data sets from 18 dimer exchange assays. It is clear that fc+ estimates vary greatly. 
Indeed, it should be noted that values in excess of 10® cause the numerical ODE 
solver to fail to achieve desired error tolerances. So these anomalously large values of 
are dubious. 

The raw data used to obtain these parameter estimates exhibits artifacts which 
suggest to experimentalists that some loss of protein is occurring during the assay. A 
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Fig. 4.4. Global sensitivity analysis using FAST with parameter ranges E [200,350] and 
k— £ [0.2,0.35]. The sensitivity index for a parameter is the ratio of variance in an outcome variable 
at the parameter’s frequency (and harmonics) to the total variance in the outcome variable. Top: 
At hour 20, the kj^ sensitivity indices in decreasing order are for heterodimer (purple), fusion dimer 
(blue), and dimer (red). Bottom: At hour 20, the k- sensitivity indices in decreasing order are 
for heterodimer (purple), dimer (red), and fusion dimer (blue). The latter two of which are nearly 
visually indistinguishable. 


corrective processing of the concentration data was used to account for these losses. 
At each time step, the data was renormalized so that total protein of LBD type and 
of Fusion type are invariant in time. The resulting parameter fits to processed data 
are shown in on the right of Figure 15.11 The new estimates for are within the 
limitations of the numerical ODE solver. 

Cluster Analysis m na na EH El] may be used to qualify what is meant by 
the terms outlier or atypical. Once identified, these outliers should be examined to 
determine the cause of their anomalous nature. MATLAB’s clusterdata function uses 
a hierarchical clustering method, though it can employ other methods such as k-means 
and Gaussian mixtures models. 

The left plot of Figure EH shows a cluster of experiments (red pentagons) which 
do not fit the mathematical model because of suspected protein loss, highlighting 
the need for an improved model or corrective processing of the data. The right plot 
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demonstrates the effect of processing to account for protein loss. Cluster analysis 
reveals that 14 of 18 parameter pairs are in the same cluster (grey circles). 
Definition 4. Let {pk} be the set of data points, {ck} be the set of clusters the 
data has been separated into, and fi{pi,pj) be a measure of distance between two 
data promts. Given a point pi from cluster Cki containing |cfej elements, the silhouette 
widtn^ of Pi is defined by 


S{Pi) = 


0 if |cfej = 1 


where Ai is the mean distance from pi to all points in the same cluster and Bi is the 
minimum mean distance from pi to all points in other clusters. 


A,, = 


Cfci - 




Pi 


Bi — min 

kitki 


(ra.?,"'"'''’'’ 


Pj£Ck 


Note that S{p) G [—1,1] for all p. A silhouette value near +1 indicates that a point 
is well matched to its assigned cluster and poorly matched to the others. A negative 
silhouette value indicates that a point may have been assigned to the wrong cluster. 

To create Figure 15.11 silhouette widths were computed for various numbers of 
clusters. Choosing four clusters gave an optimal value for the average silhouette 
width. The reader is encouraged to investigate MATLAB’s silhouette and evalclus- 
ters functions. For readers familiar with the R statistical programming language, 30 
methods for choosing the number of clusters have been incorporated into the package 
NbClust [S]. 




0.4 

0.3 

0.2 

0.1 




••• 



10 ^ 10*5 10 ® 10 “ 10 ^ 10 ^ 10 ® 10 ^ 


k+ 


k+ 


Fig. 5.1. Left: Cluster analysis applied to raw data. Right: Cluster analysis applied to processed 
data from Table 1C. II in the appendix. 

Variation in the computed parameter estimates can arise from many sources. Al¬ 
terations may have been made to the experimental protocol or the post processing 
of the data. These changes may have been made intentionally by the experimenters 
or unintentionally due to the difficulty of controlling experimental conditions. Even 
given consistent data, the mathematical model based on this data may be incomplete 
and unable to capture phenomena exhibited in all experiments. 

Using raw data, the computational model of dimer exchange failed to fit the ex¬ 
perimental model. This is explained by Assumption [5J which stated that losses of 

tThis is the original definition, given by Rousseeuw [42]. Alternate definitions, including the one 
used by MATLAB, assign 5 = 1 to singleton clusters. 
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protein would be ignored when forming the mathematical model. This is in conflict 
with the experimentalist’s suspicion of protein loss. Corrective processing of the data 
was used by experimentalists in an attempt to compensate for the protein loss hypoth¬ 
esis. An alternative would be to reformulate the model to include protein loss. Data 
processing and model alterations will be discussed further in lj6l 

Visual inspection can be deceiving, leading to dubious subjective conclusions. 
Changing the scales of the axes in Figure 15.11 could make the rate constants look 
more or less associated. The clustering method used here also requires subjective 
specification of the number of clusters to search for. Although sophisticated methods 
of cluster number selection exist, ambiguity and subjectivity persist. 

Objective analytical methods should be used to compliment intuition gained from 
graphical displays and cluster analysis. After performing a cluster analysis, the 14 
pairs of rate constants in the main cluster (grey circles) were used to compute the 
statistical means, standard deviations, and coefficients of variation. 


014,^ = (Jk+/^J^k+ ~ 1-78 , 
CVk_ = Ok_/y^k_ ~ 0.223. 


[ik^ « 628, 

iik_ Ri 0.243, 


Gk^ Ra 352, 
Gk_ - 0.0540, 


The standard deviations suggest that fc_|_ estimates may be much more variable than 
those for fc_, consistent with conclusions drawn from a visual inspection of Figure [5TTJ 
However, the scales for the rate constants are different by roughly three orders of 
magnitude. 

To compare such dissimilar data sets, the standard deviation as a percentage of 
the mean is often used. These coefficients of variation [sa [39] show the forward 
rate constants to deviate from the mean about eight times as much as the backward 
constants. The statistical measures used here provide a superficial glimpse into the 
dimer exchange data. Much deeper analyses of parameter estimates are common [1]. 
Future work aims to reduce the variance in by collecting more data from the first 
hour of the assay, to address the peak parameter sensitivities described in 114.31 

6. Discussion. Given the preponderance of wet-lab biology data, mathematical 
methods are needed to analyze and give meaning to the data. The process employed 
above is an example of one such type of analysis. 

The first stage is the acquisition of experimental data. Information from exper¬ 
imenters and trusted mathematical techniques are then used to form an analytical 
model of the mechanism underlying the experimental protocol. The model is reduced 
by utilizing biological knowledge and analytical techniques. The model is then simu¬ 
lated in order to fit parameters to the empirical data, using sensitivity analysis as a 
guide toward the most relevant parts of the experiment. Data analytic methods may 
then detect atypical experiments and allow statistical inferences of the trustworthiness 
of parameter estimates 

Accurate models and parameter estimates should not be considered the ultimate 
goal of this process. To complete the cycle of experimentation and modeling, conclu¬ 
sions drawn from the data should be used to inform the experimental model. 

Indeed, the outliers may highlight breaches of experimental protocol. Large vari¬ 
ance may point towards insufficient data collection or an ineffective experimental 
technique. Analytical models with poor fit to the empirical data may indicate that 
the model is inadequate or that the underlying biology is misunderstood. 

The present study has revealed the need for biochemists to process the data to 
account for protein loss. It also motivates the mathematician to rebuild models to 
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account for losses (aggregation, non-specific binding, degradation, etc.) so that such 
data processing isn’t required. This is the subject of future collaborative work. 

The models considered here were presented to biochemists studying estrogen re¬ 
ceptors. The outcomes strengthened their hypothesis that protein aggregation was 
occurring prior to injection of the mixture into the chromatography column. This 
hypothesis has been incorporated into a more complex mathematical model not pre¬ 
sented here. The researchers are “extremely interested” in learning the outcome of this 
new model, as protein aggregation is an important area of contemporary research [B] . 

Nonlinear regression techniques used by biochemists are able fit exponential mod¬ 
els to the type of data displayed in Figure 12.31 Due to the disparity between the 
time scales for association and dissociation, this regression gives estimates of the dis¬ 
sociation parameter fc_ but not of k^. These exponential fits agree with available 
experimental data on dimer concentrations. However, without experimental measure¬ 
ments of monomer concentrations, information on association was assumed to be lost. 

The analytical methods demonstrated above give the first estimates of k'^. 
Though monomer concentrations aren’t discernible, their presence evidently leaves 
a shadow in the dimer concentration data. The models and methodology above are 
the first to shed light on these shadows. Motivated by sensitivity analysis of het¬ 
erodimer concentrations with respect to k +, one goal of future experiments is to find 
ways to collect data over shorter time intervals. 

The realm of collaboration between mathematicians and biologists extends beyond 
the analysis of data to support existing hypotheses. The goal of collaboration is to 
add mathematical analysis to the set of tools available to biologists, that is, to enable 
mathematics to be a new type of laboratory equipment. 


Appendices 

A. Experimental Data. In the tables of this section, the quantities Du, D 22 , 
and Di 2 are the experimentally measured concentrations of LED homodimer, fusion 
homodimer, and heterodimer, respectively. The units of concentration are micromo¬ 
lar. Time is recorded in hours. Limitations of the chromatography process make it 
impossible to record data before two minutes, i.e., 1/30 hour. 

To produce this data, raw chromatograms were processed to account for protein 
loss. At each time step, the data was renormalized so that total protein of LED type 
and of Fusion type are invariant in time. All data is from unpublished work from the 
Brandt Lab [Ml 155] . 


Table A.l 

Data from the dimer exchange assay plotted in Figure V2.31 


t 

1/30 

1 

2 

3 

4 

5 

7 

9 

11 

13 

15 

18 

23 

29 

35 

Dll 

1.88 

1.65 

1.46 

1.33 

1.23 

1.18 

1.10 

1.06 

1.03 

1.02 

1.00 

1.00 

1.01 

0.99 

0.99 

D 22 

2.08 

1.86 

1.67 

1.53 

1.44 

1.39 

1.30 

1.26 

1.24 

1.22 

1.20 

1.19 

1.21 

1.19 

1.19 

Di2 

0.16 

0.62 

1.00 

1.27 

1.46 

1.57 

1.72 

1.81 

1.86 

1.90 

1.93 

1.93 

1.91 

1.96 

1.96 
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Table A .2 

Data from the dimer exchange assay plotted in Figure \4 ■ S\ 


t 

1/30 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

14 

16 

18 

20 

On 

1.71 

1.56 

1.40 

1.28 

1.14 

1.17 

1.14 

1.06 

1.10 

1.10 

1.07 

1.09 

1.06 

1.03 

1.06 

1.06 

1.05 

O22 

1.75 

1.59 

1.43 

1.31 

1.17 

1.20 

1.17 

1.09 

1.13 

1.13 

1.10 

1.13 

1.09 

1.06 

1.09 

1.09 

1.08 

O12 

0.62 

0.93 

1.26 

1.50 

1.78 

1.71 

1.78 

1.93 

1.86 

1.86 

1.91 

1.87 

1.93 

1.99 

1.94 

1.94 

1.95 


Table A.3 

Data from the dimer exchange assay plotted in Figure \4 . 2| 


t 

1/30 

1 

2 

3 

4 

5 

6 

7 

8 

9 

10 

11 

12 

14 

16 

18 

20 

On 

1.52 

1.32 

1.20 

1.06 

1.00 

0.94 

0.89 

0.88 

0.87 

0.87 

0.86 

0.81 

0.76 

0.82 

0.84 

0.82 

0.83 

O22 

2.25 

2.05 

1.94 

1.79 

1.73 

1.67 

1.62 

1.61 

1.60 

1.60 

1.59 

1.55 

1.49 

1.55 

1.57 

1.55 

1.56 

O12 

0.47 

0.88 

1.11 

1.39 

1.52 

1.54 

1.74 

1.76 

1.77 

1.78 

1.80 

1.89 

1.99 

1.87 

1.83 

1.87 

1.87 


B. Local Sensitivities. Define local sensitivities by 


dx 


Xi = 


X2 = 


dx 


yi = 


dy 


y2 = 


dy 


Zi = 


dz 


Z2 = 


dz 


*+’ ^ dk+' ^ dk_ dk+' dk_ 

The differential equations governing local sensitivity with respect to {fc+, fc_} are 
dxi 


(B.la) 

(B.lb) 

(B.lc) 

(B.ld) 

(B.le) 

(B.lf) 


dt 

dX2 

dt 

dyi 

dt 

dy2 

dt 

dzi 

dt 


dz' 


■- {—4:k+(2di + di 2 —2x — z) — k-)xi + (2di + di 2 —2x — z)^, 

■- (—4k+(2di + di 2 — 2x — z) — k-)x 2 — x, 

■■ (-4fc+(2d2 + di 2 - 22 / - z) - fc_) 2 /i + ( 2^2 + di 2 - 2// - zf, 

: (-4fc+(2d2 + di 2 -2y- z) - fc-) 2/2 - y, 

(—4fc+(di + d 2 + di 2 — X — y — z) — fc_)zi 
+ 2(2di + di 2 —2x — z){ 2 d 2 — 2y — z), 


—— = (-4fc+(di + d 2 + di 2 - X - y - z) - fc_)z 2 - z, 
dt 


with Xi{to) = 0, Viito) = 0, and Zi{to) = 0. 


C. Best Fit Parameters for 18 Experiments. Data fitting routines were 
used to compute pairs of best fit parameters for 18 sets of processed dimer exchange 
data. The parameters and fc_ have units of l/micromolar*hours and 1/hours, 
respectively. 


Table C.l 

Parameter estimates for processed data corresponding to Fiaure V5,l\ 


k+ X 10-2 

2.39 

2.63 

2.40 

2.89 

2.98 

2.88 

3.08 

2.64 

3.68 

2.83 

2.45 

4.40 

3.04 

2.51 

3.82 

3.04 

3.90 

2.42 

fc _ X IQi 

2.95 

2.38 

2.98 

3.09 

3.05 

2.16 

1.94 

3.01 

2.23 

2.59 

2.38 

3.14 

1.51 

1.81 

1.46 

1.73 

1.10 

2.44 
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D. MATLAB Algorithms. The primary algorithms used to simulate models 
and process data for the dimer exchange assay are listed and described below. 
odel5s This implicit solver is useful for solving ODE systems with widely varying 
time scales. It is very commonly used in mathematical biology. Compare to 
the explicit solver ode45, which is more useful for problems which are not stiff. 
Use odeset to specify conditions such as error tolerances, 
fminsearch This minimization method often succeeds when gradient-based methods, 
such as Isqcurvefit, fail. Use optimset to specify conditions such as error 
tolerances. 

clusterdata This is an agglomerative hierarchical clustering routine packaging to¬ 
gether several MATLAB functions. Its high level nature makes it simple to 
use. The low level details may be explored through its options or the related 
functions linkage and cluster. 

silhouette The silhouette value for each point is a measure of how similar that point 
is to points in its own cluster, when compared to points in other clusters. 
Silhouettes can be useful for determining an appropriate number of clusters 
to seek. 

evalclusters A routine which, given a clustering algorithm, determines an appropri¬ 
ate choice for the number of clusters by attempting to optimize silhouette 
widths, gap statistics, or other criteria. 
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